Abstract

Let $F$ be a class of functions on a probability space $(\Omega,\mu)$ and let $X_1,\dots,X_k$ be independent random variables distributed according to $\mu$. We establish high probability tail estimates of the form $\sup_{f\in F}|\{i : |f(X_i)| > t\}|$ using a natural parameter associated with $F$. We use this result to analyze weakly bounded empirical processes indexed by $F$ and processes of the form $Z_f = \left|k^{-1}\sum_{i=1}^k |f|^p(X_i) - \mathbb{E}|f|^p\right|$ for $p > 1$.

We also present some geometric applications of this approach, based on properties of the random operator $\Gamma = k^{-1/2}\sum_{i=1}^k \langle X_i,\cdot\rangle e_i$, where the $(X_i)_{i=1}^k$ are sampled according to an isotropic, log-concave measure on $\mathbb{R}^n$.


1 Introduction 

Empirical Processes theory focuses on understanding the behavior of the supremum of the process

$$f \mapsto Z_f = \frac{1}{k}\sum_{i=1}^k f(X_i) - \mathbb{E}f,$$

where $F$ is a class of functions on a probability space $(\Omega,\mu)$, $f \in F$ and $(X_i)_{i=1}^k$ are independent random variables distributed according to $\mu$. Let $\mu_k$ denote the random empirical measure $k^{-1}\sum_{i=1}^k \delta_{X_i}$; for a class $F$ we denote the supremum of the empirical process indexed by $F$ by $\|\mu_k - \mu\|_F$.

Often, one would like to bound this supremum using geometric properties of the set $F$, but the question we tackle here is slightly different; our aim is to bound the supremum of the empirical process indexed by powers of the class $F$, that is, the supremum of the process indexed by the set $F^p = \{|f|^p : f \in F\}$ for $p > 1$ using the geometry of the set $F$ rather than the geometry of $F^p$. The difficulty arises when elements in $F$ are not necessarily bounded functions, or in cases where the $L_\infty$ bound is weak - while the situation is considerably simpler in the bounded case. For example, if $F$ consists of functions bounded by 1 then the empirical process indexed by $F^p$ can be bounded using a combination of symmetrization and contraction arguments. Indeed, by the Giné-Zinn symmetrization method (see, for example, [?]),


$$\mathbb{E}\|\mu_k - \mu\|_{F^p} \le 2\mathbb{E}\sup_{f\in F}\left|\frac{1}{k}\sum_{i=1}^k \varepsilon_i |f|^p(X_i)\right| \le 2p\,\mathbb{E}\sup_{f\in F}\left|\frac{1}{k}\sum_{i=1}^k \varepsilon_i f(X_i)\right|,$$

where $(\varepsilon_i)_{i=1}^k$ are independent, symmetric $\{-1,1\}$-valued random variables.

The last inequality is evident from a contraction principle [?] and the fact that $|x|^p$ is a Lipschitz function on $[-1,1]$ with constant $p$.

Moreover, for a class of uniformly bounded functions, the supremum of the empirical process $\|\mu_k - \mu\|_F$ is highly concentrated around its mean, as the following theorem, due to Talagrand, shows.

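Theorem 1.1 Let $F$ be a class of mean zero functions defined on $(\Omega,\mu)$ such that for every $f \in F$, $\|f\|_\infty \le b$. Let $X_1,\dots,X_k$ be independent random variables distributed according to $\mu$ and set $\sigma^2 = k\sup_{f\in F}\mathrm{var}(f)$. Define

$$Z = \sup_{f\in F}\sum_{i=1}^k f(X_i), \qquad \bar{Z} = \sup_{f\in F}\left|\sum_{i=1}^k f(X_i)\right|.$$

Then, for every $x > 0$,

$$\Pr\left(\{|Z - \mathbb{E}Z| \ge x\}\right) \le c_1\exp\left(-\frac{x}{c_2 b}\log\left(1 + \frac{bx}{\sigma^2 + b\,\mathbb{E}\bar{Z}}\right)\right), \qquad (1.1)$$

where $c_1$ and $c_2$ are absolute constants. The same inequality is also true when $\bar{Z}$ replaces $Z$ in (1.1).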

Unfortunately, in many applications the function class at hand does not consist of uniformly bounded functions, or even if the functions are, the uniform bound is very bad. One such example which motivated this study is the class of linear functionals of Euclidean norm 1 on $\mathbb{R}^n$, where the variables $X_i$ are distributed according to a Borel measure on $\mathbb{R}^n$ which is natural from the geometric viewpoint, namely, a measure which is isotropic and log-concave.

Definition 1.2 A probability measure $\mu$ on $\mathbb{R}^n$ is called isotropic if for every $y \in \mathbb{R}^n$, $\int |\langle x,y\rangle|^2 d\mu(x) = \|y\|^2$. The measure $\mu$ is log-concave if for every $0 < \lambda < 1$ and every Borel measurable $A, B \subset \mathbb{R}^n$, $\mu(\lambda A + (1-\lambda)B) \ge \mu(A)^\lambda\mu(B)^{1-\lambda}$, where $A + B$ is the Minkowski sum of $A$ and $B$.

A question of particular interest in this case can be formulated as follows: 


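Question 1.3 Let $\mu$ be an isotropic measure on $\mathbb{R}^n$ and let $X_1,\dots,X_k$ be independent, distributed according to $\mu$. Given $T \subset \mathbb{R}^n$, for every $0 < \varepsilon, \delta < 1$ and $p \ge 1$, what is the smallest integer $k_0$ such that for every $k \ge k_0$, with probability at least $1 - \delta$,

$$\sup_{t\in T}\left|\frac{1}{k}\sum_{i=1}^k |\langle X_i,t\rangle|^p - \mathbb{E}|\langle X,t\rangle|^p\right| \le \varepsilon?$$

Two simple examples which come to mind are when $p = 2$, $T = S^{n-1}$ and $\mu$ is the Gaussian measure on $\mathbb{R}^n$ or the uniform measure on the vertices of the unit cube.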


Example 1.4 For every $t \in \mathbb{R}^n$ define the linear functional $f_t = \langle t,\cdot\rangle$ and set $F = \{f_t : t \in S^{n-1}\}$. Let $\mu_G$ be the Gaussian measure on $\mathbb{R}^n$ and note that for every $t \in S^{n-1}$, $\mathbb{E}|f_t|^2 = 1$. Then,

$$\|\mu_k - \mu\|_{F^2} = \sup_{t\in S^{n-1}}\left|\frac{1}{k}\sum_{i=1}^k \langle X_i,t\rangle^2 - 1\right| = \sup_{t\in S^{n-1}}\left|\frac{1}{k}\|\Gamma t\|^2 - 1\right|,$$

where $\Gamma$ is a random $k \times n$ matrix with independent, standard Gaussian random variables as entries. Hence, if $\|\mu_k - \mu\|_{F^2} \le \varepsilon$, the Gaussian matrix is an almost isometric embedding of $\ell_2^n$ in $\ell_2^k$, which is a well known and useful fact and occurs as long as $k \ge c(\varepsilon,\delta)n$ (see [?]). Another example is when $\mu = \mu_R$ is the uniform probability measure on $\{-1,1\}^n$. Thus, if $\|\mu_k - \mu\|_{F^2} \le \varepsilon$ then a random $k \times n$ matrix with independent, symmetric, $\{-1,1\}$-valued entries is an almost isometric embedding of $\ell_2^n$ in $\ell_2^k$. Unfortunately, functions in $F$ on the probability space $(\mathbb{R}^n,\mu_G)$ are not bounded, while on $(\{-1,1\}^n,\mu_R)$ the best uniform $L_\infty$ bound is $\sup_{t\in S^{n-1}}\|f_t\|_\infty \le \sqrt{n}$, which is too weak to be useful. Therefore, symmetrization and concentration methods which are so helpful in the bounded case can not assist in resolving Question 1.3 here, as well as in other, more general examples we will explore.

The useful property of linear functionals (with respect to both $\mu_G$ and $\mu_R$) is that for every $f_t \in F$,

$$\Pr(|f_t| \ge u) \le 2\exp(-cu^2)$$

for a suitable absolute constant $c$, implying that functions in $F$ exhibit a subgaussian behavior. Moreover, using Borell's inequality [?], one can show that if $\mu$ is an arbitrary isotropic log-concave measure, linear functionals exhibit a subexponential decay.

To formulate these decay properties in a more accurate way, we require the definition of Orlicz norms [?].

Definition 1.5 For $\alpha \ge 1$ the $\psi_\alpha$ norm of a random variable $Y$ is defined by

$$\|Y\|_{\psi_\alpha} = \inf\{u > 0 : \mathbb{E}\exp(|Y|^\alpha/u^\alpha) \le 2\}.$$

It is standard to verify that if $Y$ has a bounded $\psi_\alpha$ norm then $\Pr(|Y| \ge t) \le 2\exp(-ct^\alpha/\|Y\|_{\psi_\alpha}^\alpha)$ where $c$ is an absolute constant. The reverse direction is also true, and if $Y$ has a tail bounded by $\exp(-t^\alpha/K^\alpha)$ then $\|Y\|_{\psi_\alpha} \le c_1K$.
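
To make Definition 1.5 concrete, the following minimal sketch computes the $\psi_\alpha$ norm of the empirical distribution of a finite sample by bisection. It is an illustration only: the function name `psi_alpha_norm` and the tolerance parameter are assumptions introduced here, and the sample average stands in for the expectation.

```python
import math

def psi_alpha_norm(sample, alpha=1.0, rel_tol=1e-10):
    """Empirical psi_alpha norm of a finite sample (hypothetical helper).

    Returns, up to rel_tol, the smallest u > 0 such that the sample mean of
    exp(|y|^alpha / u^alpha) is at most 2.
    """
    if alpha < 1:
        raise ValueError("alpha must be at least 1")
    n = len(sample)
    m = max(abs(y) for y in sample)
    if m == 0.0:
        return 0.0

    def orlicz_mean(u):
        return sum(math.exp((abs(y) / u) ** alpha) for y in sample) / n

    # Feasible end: every term is at most exp(log 2) = 2.
    hi = m / math.log(2.0) ** (1.0 / alpha)
    # Infeasible end: the largest term alone already forces a mean of 2e.
    lo = m / (math.log(2.0 * n) + 1.0) ** (1.0 / alpha)
    # orlicz_mean is decreasing in u, so bisection locates the threshold.
    while hi - lo > rel_tol * hi:
        mid = 0.5 * (lo + hi)
        if orlicz_mean(mid) <= 2.0:
            hi = mid
        else:
            lo = mid
    return hi
```

For a sample consisting of the single value 1 the constraint reads $\exp(1/u^\alpha) \le 2$, so the norm is $(\log 2)^{-1/\alpha}$; the norm is also positively homogeneous, which the sketch preserves.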

Our main goal is to show how decay properties of individual class members can be combined to control $\|\mu_k - \mu\|_{F^p}$.

As a starting point, let us consider the linear case where, in addition, functionals are subgaussian with respect to the $\ell_2^n$ norm, i.e. for every $y \in \mathbb{R}^n$, $\|\langle y,X\rangle\|_{\psi_2} \le c\|y\|_2$. In particular, the diameter of $F = S^{n-1}$ is bounded with respect to the $\psi_2$ norm.

This fact by itself is not enough to bound $\|\mu_k - \mu\|_{F^p}$, and to that end we require the following notion of complexity of the class $F$.

Definition 1.6 For a metric space $(T,d)$, an admissible sequence of $T$ is a collection of subsets of $T$, $\{T_s : s \ge 0\}$, such that for every $s \ge 1$, $|T_s| = 2^{2^s}$ and $|T_0| = 1$. For $\beta \ge 1$, define the $\gamma_\beta$ functional by

$$\gamma_\beta(T,d) = \inf\sup_{t\in T}\sum_{s=0}^\infty 2^{s/\beta}d(t,T_s),$$

where the infimum is taken with respect to all admissible sequences of $T$.
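
As a small illustration of Definition 1.6, the sketch below evaluates the chaining sum $\sup_t\sum_s 2^{s/\beta}d(t,T_s)$ for one given sequence of nets on a finite metric space; any such value is an upper bound on $\gamma_\beta(T,d)$, since the functional is the infimum over all admissible sequences. The name `chaining_value` is hypothetical, and the cardinality check uses the common relaxation $|T_s| \le 2^{2^s}$, which is an assumption made here so that finite sets can be handled.

```python
def chaining_value(points, dist, nets, beta=2.0):
    """Chaining sum for one sequence of nets on a finite metric space.

    nets[s] plays the role of T_s. The relaxed constraint |T_0| = 1 and
    |T_s| <= 2^(2^s) for s >= 1 is an assumption of this sketch.
    """
    for s, net in enumerate(nets):
        limit = 1 if s == 0 else 2 ** (2 ** s)
        if not net or len(net) > limit:
            raise ValueError(f"net {s} violates the cardinality constraint")

    def cost(t):
        # d(t, T_s) is the distance from t to the nearest point of the net.
        return sum(2 ** (s / beta) * min(dist(t, u) for u in net)
                   for s, net in enumerate(nets))

    return max(cost(t) for t in points)
```

Once a net contains every point, all later terms vanish, so a finite list of nets suffices. For the four points 0, 1, 2, 3 on the line with nets `[[1.0], [0.0, 1.0, 2.0, 3.0]]` the value is 2, attained at the point 3.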
In [12] the question of estimating $\|\mu_k - \mu\|_{F^2}$ has been studied for sets of functions which have a bounded diameter with respect to the $\psi_2$ metric and a finite $\gamma_2(F,\psi_2)$, under the additional assumption that for every $f \in F$, $\mathbb{E}f^2 = 1$.

Theorem 1.7 [12] There exist absolute constants $c_1, c_2, c_3$ for which the following holds. Let $(\Omega,\mu)$ be a probability space, set $F$ to be a subset of the unit sphere of $L_2(\mu)$ and assume that $\mathrm{diam}(F,\psi_2) = \alpha$. Then, for any $\theta > 0$ and $k \ge 1$ satisfying

$$c_1\alpha\gamma_2(F,\psi_2) \le \theta\sqrt{k},$$

with probability at least $1 - \exp(-c_2\theta^2k/\alpha^4)$, $\|\mu_k - \mu\|_{F^2} \le \theta$. Moreover, if $F$ is symmetric, then $\mathbb{E}\|\mu_k - \mu\|_{F^2} \le c_3\alpha\gamma_2(F,\psi_2)/\sqrt{k}$.


Theorem 1.7 gives an answer to Question 1.3 for $p = 2$ under a $\psi_2$ assumption in a very general situation. It is particularly helpful when the $\psi_2$ metric endowed on $F$ is equivalent to the $L_2$ metric, that is, if for every $f,g \in F$, $\|f - g\|_{\psi_2} \le \beta\|f - g\|_{L_2}$. In such a case, $\mathrm{diam}(F,\psi_2) \sim \mathrm{diam}(F,L_2)$ and $\gamma_2(F,\psi_2) \sim \gamma_2(F,L_2)$, where by $A \sim B$ we mean that there are absolute constants $c$ and $C$ such that $cA \le B \le CA$. By the majorizing measures Theorem (see [?] for the most recent survey on the subject), $\gamma_2(F,L_2)$ is equivalent to the expectation of the supremum of the Gaussian process indexed by $F$, denoted by $\mathbb{E}\|G\|_F$. Therefore, under a $\psi_2$ assumption, Theorem 1.7 implies that if $F \subset S(L_2)$ then for every $0 < \delta < 1$, with probability at least $1 - \delta$,

$$\|\mu_k - \mu\|_{F^2} \le c\,\frac{\mathbb{E}\|G\|_F}{\sqrt{k}},$$

where $c$ depends on $\delta$ and on the equivalence constant between the $\psi_2$ and $L_2$ metrics.

In the geometric context of Example 1.4, Theorem 1.7 is helpful when the indexing set is an arbitrary subset of $S^{n-1}$. Moreover, if the measure $\mu$ happens to be isotropic, then the Gaussian process indexed by $F$ is the isonormal one and thus $\gamma_2(F,L_2) \sim \mathbb{E}\sup_{t\in T}|\sum_{i=1}^n g_it_i|$, where $g_1,\dots,g_n$ are independent, standard Gaussian variables.

Unfortunately, the assumption that the $\psi_2$ metric is equivalent to the $L_2$ metric is overly optimistic. In particular, the class may not have a well bounded diameter in $\psi_2$, or the diameter could be of the same order of magnitude as $\gamma_2(F,\psi_2)$. For example, if $\mu$ is log-concave and isotropic, then for every $y \in \mathbb{R}^n$, the function $f_y = \langle y,\cdot\rangle$ satisfies $\|f_y\|_{\psi_1(\mu)} \le c\|f_y\|_{L_2(\mu)}$ and the $\psi_1$ and $L_2$ norms are equivalent on $\mathbb{R}^n$, but in contrast, the $\psi_2$ diameter of $S^{n-1}$ might be polynomial in the dimension (e.g. $\sqrt{n}$ when $\mu$ is the normalized volume measure on the isotropic position of the unit ball of $\ell_1^n$). Hence, the bound one can establish from Theorem 1.7 is useless in such cases because of the way it depends on the $\psi_2$ diameter of the set.

It would be desirable to prove a result of a similar flavor to Theorem 1.7 with the $\psi_2$ diameter of $F$ replaced by the $\psi_1$ diameter, and which also removes the restrictions that $p = 2$ and that $F \subset S(L_2)$. Our main result implies just that.

To see why the $\psi_1$ case is considerably more difficult than the $\psi_2$ one, consider a single function $h \in L_{\psi_1}$. By Bernstein's inequality (Lemma 2.2 below), empirical means of $h$ are highly concentrated around its expectation, with a tail which decays exponentially in sample size. Clearly, if a function $f \in L_{\psi_2}$ then $f^2 \in L_{\psi_1}$ and hence exhibits the degree of concentration needed in the proof of Theorem 1.7. On the other hand, if $f \in L_{\psi_1}$ the degree of concentration of empirical means of $f^2$ around $\mathbb{E}f^2$ is not strong enough for that approach.

To overcome this obstacle, the method we suggest here is to decompose $F$ to two subsets $F_1$ and $F_2$ which satisfy that $F \subset F_1 + F_2$.

Fix $\theta(k) > 0$ and consider the sets $F_1 = \{f1_{\{|f|\le\theta\}} : f \in F\}$ and $F_2 = \{f1_{\{|f|>\theta\}} : f \in F\}$. Since all the functions in $F_1$ are bounded by $\theta$, the empirical mean $\mu_k(f)$ is highly concentrated around the true mean for any $f \in F_1$ and $\|\mu_k - \mu\|_{F_1}$ (or $\|\mu_k - \mu\|_{F_1^p}$, using a contraction argument) is well behaved. The key point in this approach is to control the "large part" of the process, namely,

$$\sup_{f\in F}k^{-1}\sum_{i=1}^k|f|^p(X_i)1_{\{|f|>\theta\}}(X_i),$$

and to show that the supremum is small even for a relatively low level of truncation $\theta$. The reason this supremum is small has nothing to do with the concentration of each individual class member around its mean, but rather with the fact that with high probability, all the functions in $F$ have an empirical distribution which decays quickly. And indeed, the main Theorem we present is an "empirical processes" version of a result due to Bourgain on the distribution of functions in $F$ with respect to the (random) empirical measure $\mu_k$.

Theorem A. There exist absolute constants $c_1$, $c_2$ and $c_3$ for which the following holds. Let $F$ be a class of mean zero functions on $(\Omega,\mu)$. For every $v_1,v_2 \ge c_1$, with probability at least $1 - \exp(-c_2\min\{v_1^2,v_2\})$, for any $f \in F$ and $t > 0$

$$|\{i : |f(X_i)| > t\}| \le \max\left\{\frac{c_3v_1^2\gamma_2^2(F,\psi_2)}{t^2},\ ek\exp\left(-\frac{t}{c_3\alpha v_2}\right)\right\},$$

where $\alpha = \mathrm{diam}(F,\psi_1)$.

Bourgain's argument is very different from ours and is tailored to the specific case $F = \{\langle y,\cdot\rangle : y \in S^{n-1}\}$, where $X_1,\dots,X_k$ are selected according to a log-concave measure on $\mathbb{R}^n$ (see Section 3 for a more detailed discussion).

The proof of Theorem A is based on the following estimate (which will be shown to be optimal) on the $\ell_1$ structure of a random coordinate projection of $F$.

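Theorem B. For every $0 < \delta < 1$ there is a constant $c(\delta)$ for which the following holds. For every integer $k$, with probability at least $1 - \delta$, for every $f \in F$ and $I \subset \{1,\dots,k\}$,

$$\sum_{i\in I}|f(X_i)| \le c(\delta)\left(\sqrt{|I|}\gamma_2(F,\psi_2) + \mathrm{diam}(F,\psi_1)|I|\log\frac{ek}{|I|}\right).$$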



We present several geometric applications of Theorem A. The first of which is a "log-concave" version of the celebrated result of Pajor and Tomczak-Jaegermann [?] on sections of small diameter of a convex, symmetric body $K$ (see also [?] for results along the same lines). We show that if $X_1,\dots,X_k$ are selected according to an isotropic log-concave measure on $\mathbb{R}^n$, then with high probability, the intersection of the kernel of the operator $\Gamma = \sum_{i=1}^k\langle X_i,\cdot\rangle e_i$ with $K$ will have a small diameter.

Theorem C. For every $0 < \delta < 1$ there exists a constant $c(\delta)$ for which the following holds. Let $\mu$ be an isotropic, log-concave measure on $\mathbb{R}^n$ and let $K \subset \mathbb{R}^n$ be a convex symmetric body. If $X_1,\dots,X_k$ are independent, distributed according to $\mu$, then with probability at least $1 - \delta$,

diam(keiT n K) < ql,{K), 


where 

ql{K) = inf |p > 0 : p > c(5) ^^^^ — | , 
and $V_\rho = \gamma_2(K \cap \rho S^{n-1},\psi_2)$.

If $\mu$ is a subgaussian measure, Theorem C gives a weaker result (by up to a factor of $\sqrt{\log n}$) and with a weaker probability estimate than Theorem [?]. On the other hand, it is applicable for a wider set of measures.

The downside of our approach is that it depends on the parameter $\gamma_2(F,\psi_2)$ which is often hard to bound. However, as we show, a completely $\psi_1$ version of Theorem B is not true and one might have to use the additional structural assumptions on the indexing set to improve our estimate. Luckily, in the case $F = \{f_t : t \in S^{n-1}\}$, it is possible to bound $\|\mu_k - \mu\|_{F^p}$ in a rather strong sense (though probably suboptimal by a logarithmic factor) using a truncation of the measure $\mu$. Let $\mu$ be a probability measure on $\mathbb{R}^n$, for every integer $k$, let $X_1,\dots,X_k$ be independent, distributed according to $\mu$ and set $H_k = \mathbb{E}\max_{1\le i\le k}\|X_i\|$. Observe that if $Y_i = X_i1_{\{\|X_i\|\le c_1(\delta)H_k\}}$, then with probability at least $1 - \delta$, $X_i = Y_i$ for $1 \le i \le k$. Thus, writing $\nu$ for the distribution of $Y$, one can consider the process $\|\nu_k - \nu\|_{F^p}$ instead of the original process $\|\mu_k - \mu\|_{F^p}$. Moreover, one can show

Theorem D. There exist absolute constants $c_1$, $c_2$ and $c_3$ for which the following holds. If $F = \{f_t : t \in S^{n-1}\}$ then

$$\gamma_2(F,\psi_2(\nu)) \le c_1H_k\sqrt{\log n}.$$

Note that if $\mu$ is an isotropic log-concave measure on $\mathbb{R}^n$ and if $n \le k \le \exp(c_2\sqrt{n})$ then $H_k \le c_3\sqrt{n}$, which is a fact recently proved by Paouris [?].

As we demonstrate in Section [?], the combination of Theorem B and Theorem D allows us to bound

$$\sup_{t\in S^{n-1}}\left|\frac{1}{k}\sum_{i=1}^k|\langle t,X_i\rangle|^p - \mathbb{E}|\langle t,X\rangle|^p\right|$$

for any log-concave measure.

2 Preliminary Results

In this section we present basic results which are used throughout this article. First, a notational convention. All absolute constants are positive numbers, denoted by $c, c_1, c_2, \dots$ etc. Their value may change from line to line. We denote the Euclidean norm by $\|\ \|$, while all other norms will be clearly specified.

There are several useful results regarding the concentration and tail behavior of sums of independent random variables. The first one we present here deals with subgaussian random variables and can be easily seen using the moment generating function.

Lemma 2.1 There exists an absolute constant $c$ for which the following holds. Let $X$ be a subgaussian random variable and let $X_1,\dots,X_k$ be independent, distributed as $X$. Then, for every $a = (a_1,\dots,a_k) \in \mathbb{R}^k$,

$$\left\|\sum_{i=1}^k a_iX_i\right\|_{\psi_2} \le c\|X\|_{\psi_2}\|a\|.$$

If $X$ is not a subgaussian random variable and only exhibits a subexponential tail then Bernstein's inequality describes the way the average of independent copies of $X$ concentrates around its mean - with a tail which is a mixture of subgaussian and subexponential.

Lemma 2.2 [25] There exists an absolute constant $c$ for which the following holds. Let $X_1,\dots,X_k$ be independent copies of a mean zero random variable $X$. Then, for any $t > 0$,

$$\Pr\left(\left|\frac{1}{k}\sum_{i=1}^k X_i\right| \ge t\right) \le 2\exp\left(-ck\min\left\{\frac{t^2}{\|X\|_{\psi_1}^2},\frac{t}{\|X\|_{\psi_1}}\right\}\right).$$

It turns out that using the generic chaining method [?] combined with Lemma 2.1 or Lemma 2.2 one can bound the supremum of the empirical process indexed by $F$.

Theorem 2.3 [24] There exists an absolute constant $c$ for which the following holds. If $F$ is a class of functions on $(\Omega,\mu)$, then for every integer $k$,

$$\mathbb{E}\|\mu_k - \mu\|_F \le c\,\frac{\gamma_2(F,\psi_2)}{\sqrt{k}}$$

and

$$\mathbb{E}\|\mu_k - \mu\|_F \le c\left(\frac{\gamma_2(F,\psi_1)}{\sqrt{k}} + \frac{\gamma_1(F,\psi_1)}{k}\right),$$

and similar bounds hold with high probability.


In many cases, computing the $\gamma$ functionals is a difficult task. It is possible to upper bound them using a metric entropy integral, similar to Dudley's integral in the context of Gaussian processes.

Definition 2.4 Let $(T,d)$ be a metric space. The covering number of $T$ at scale $\varepsilon$ is the minimal number of open balls (with respect to the metric $d$) of radius $\varepsilon$ needed to cover $T$. The covering numbers of $(T,d)$ are denoted by $N(\varepsilon,T,d)$.

Since one way of forming an admissible sequence for $(T,d)$ is to use an almost optimal cover (the set $T_s$ is a cover at the scale at which one needs $2^{2^s}$ balls to cover $T$), the following is evident:

Lemma 2.5 There exists an absolute constant $c$ for which the following holds. Let $(T,d)$ be a metric space. Then,

$$\gamma_2(T,d) \le c\int_0^\infty\sqrt{\log N(\varepsilon,T,d)}\,d\varepsilon.$$
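
Covering numbers of a finite metric space can be bounded from above by a greedy construction, which is one simple way to obtain the nets that feed the entropy integral of Lemma 2.5. The sketch below is illustrative only: `greedy_net` is a hypothetical helper, its centers are taken from the set itself, and the size of the returned net is an upper bound on $N(\varepsilon,T,d)$, not its exact value.

```python
def greedy_net(points, dist, eps):
    """Greedily pick centers so that open eps-balls around them cover points.

    A point becomes a new center only if it lies at distance >= eps from all
    current centers, so the centers are eps-separated and every point ends up
    within distance < eps of some center.
    """
    centers = []
    for p in points:
        if all(dist(p, c) >= eps for c in centers):
            centers.append(p)
    return centers
```

For the points 0, 1, 2, 3 on the line and $\varepsilon = 1.5$ the construction returns the centers 0 and 2, so two open balls of radius 1.5 suffice.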

A much more difficult result, due to Talagrand [22, 24], is that if $T$ is the unit ball of a 2-convex normed space, $\gamma_2$ could be bounded from above by a sharper version of the entropy integral.

Definition 2.6 A Banach space is called 2-convex if there is $\rho > 0$ such that for $\|x\|,\|y\| \le 1$, $\|x + y\| \le 2 - 2\rho\|x - y\|^2$.

Theorem 2.7 For every $\rho > 0$ there exists a constant $c(\rho)$ for which the following holds. If $Y$ is a 2-convex Banach space with parameter $\rho$ and if the metric $d$ is given by some other norm $|\ |$, then

$$\gamma_2(B_Y,d) \le c(\rho)\left(\int_0^\infty\varepsilon\log N(B_Y,\varepsilon B_{|\ |})\,d\varepsilon\right)^{1/2}.$$

Theorem 2.7 is used in the case $Y = \ell_2^n$, the $n$-dimensional Euclidean space, where $d$ is the metric endowed on $\mathbb{R}^n$ by the $\psi_2$ norm (see Section [?]).

3 Decomposing classes of functions

Let $F$ be a class of functions on the probability space $(\Omega,\mu)$ and assume that for every $f \in F$, $\mathbb{E}f = 0$.

Let us formulate the main technical tool we require.

Theorem 3.1 There exist absolute constants $c_1$ and $c_2$ for which the following holds. Let $F$ be a class of mean zero functions on $(\Omega,\mu)$ and set $X_1,\dots,X_k$ to be independent random variables distributed according to $\mu$. Then, for every $v_1,v_2 \ge c_1$, with probability at least $1 - \exp(-c_2\min\{v_1^2,v_2\})$, for every $I \subset \{1,\dots,k\}$,

$$\sup_{f\in F}\left|\sum_{i\in I}f(X_i)\right| \le v_1\sqrt{|I|}\gamma_2(F,\psi_2) + v_2\,\mathrm{diam}(F,\psi_1)|I|\log\frac{ek}{|I|}.$$



Theorem 3.1 has a similar version in which one assumes that the set of functions is well bounded in $\psi_2$.

Theorem 3.2 There exist absolute constants $c_1$ and $c_2$ for which the following holds. Let $F$ and $X_1,\dots,X_k$ be as in Theorem 3.1. Then, for every $v \ge c_1$, with probability at least $1 - \exp(-c_2v^2)$, for every $I \subset \{1,\dots,k\}$,

$$\sup_{f\in F}\left|\sum_{i\in I}f(X_i)\right| \le v\left(\sqrt{|I|}\gamma_2(F,\psi_2) + \mathrm{diam}(F,\psi_2)|I|\sqrt{\log\frac{ek}{|I|}}\right).$$



Theorem 3.1 is an empirical processes version of a lemma due to Bourgain ([?], see also [?]) which deals with the case when $F$ is $S^{n-1}$, considered as a class of linear functionals on $\mathbb{R}^n$, and $\mu$ is an isotropic log-concave measure. Unlike Bourgain's argument, which relies heavily on the fact that the functions in the class are linear functionals and on that the indexing set is the whole sphere, Theorem 3.1 is very general.

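Observe that if the $L_2$ and $\psi_2$ metrics are equivalent on $F$ with a constant $\beta$ and if $\mathbb{E}\|G\|_F$ denotes the expectation of the supremum of the Gaussian process indexed by $F$, then by the majorizing measures Theorem there are absolute constants $c$ and $C$ and a constant $c_1(\beta)$ depending only on $\beta$ such that

$$c_1(\beta)\gamma_2(F,\psi_2) \le c\gamma_2(F,L_2) \le \mathbb{E}\|G\|_F \le C\gamma_2(F,L_2) \le C\gamma_2(F,\psi_2).$$

Therefore, by Theorem 3.1 and Theorem 3.2, with probability at least $1 - \delta$, for every $I \subset \{1,\dots,k\}$,

$$\sup_{f\in F}\left|\sum_{i\in I}f(X_i)\right| \le c(\delta,\beta)\left(\sqrt{|I|}\,\mathbb{E}\|G\|_F + \mathrm{diam}(F,\psi_1)|I|\log\frac{ek}{|I|}\right)$$

and

$$\sup_{f\in F}\left|\sum_{i\in I}f(X_i)\right| \le c(\delta,\beta)\left(\sqrt{|I|}\,\mathbb{E}\|G\|_F + \mathrm{diam}(F,\psi_2)|I|\sqrt{\log\frac{ek}{|I|}}\right).$$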
Let us point out that it is impossible to obtain a fully $\psi_1$ version of Theorem 3.1. Indeed, suppose the converse was true, and that for every set $F$ and integer $k$, with probability at least $1 - \delta$, for every $I \subset \{1,\dots,k\}$,

$$\sup_{f\in F}\left|\sum_{i\in I}f(X_i)\right| \le c(\delta)\left(\sqrt{|I|}\gamma_2(F,\psi_1) + \mathrm{diam}(F,\psi_1)|I|\log\frac{ek}{|I|}\right). \qquad (3.1)$$

Let $Y$ be an exponential random variable and let

$$X^{(n)} = \sum_{i=1}^n\frac{Y_i}{\sqrt{\log(i+1)}}e_i,$$

where $(e_i)_{i=1}^n$ is the standard basis in $\mathbb{R}^n$ and $(Y_i)_{i=1}^n$ are independent copies of $Y$. Setting $\mu^{(n)}$ to be the measure on $\mathbb{R}^n$ which $X^{(n)}$ endows, (3.1) can not be true for $\mu^{(n)}$ and $F_n = B_1^n$, the unit ball in $\ell_1^n$, even when $k = 1$.

Indeed, using Borell's inequality (see, e.g. [?]) or by a direct computation as in [?], it is evident that for every $t \in \mathbb{R}^n$,

$$\|\langle t,X^{(n)}\rangle\|_{\psi_1} \le c\|\langle t,X^{(n)}\rangle\|_{L_2} = c\left(\sum_{i=1}^n\frac{t_i^2}{\log(i+1)}\right)^{1/2} = c\|t\|_{\ell_2^n(w)},$$

where $\|\ \|_{\ell_2^n(w)}$ is the weighted Euclidean norm with weights $(1/\sqrt{\log(i+1)})_{i=1}^n$ and $c$ is an absolute constant. Hence, by the majorizing measures Theorem and a standard computation, there are absolute constants $c$, $c_1$ and $c_2$ such that for every $n$,

$$\gamma_2(F_n,\psi_1(\mu^{(n)})) \le c\gamma_2(B_1^n,\ell_2^n(w)) \le c_1\mathbb{E}\sup_{t\in B_1^n}\left|\sum_{i=1}^n\frac{g_it_i}{\sqrt{\log(i+1)}}\right| \le c_2.$$

Therefore, if (3.1) were correct for $k = 1$, it would follow that with probability of at least $1/2$, $\sup_{t\in B_1^n}\langle t,X^{(n)}\rangle \le c_3$, for a suitable $c_3$ which is independent of $n$. On the other hand, an easy computation shows that with probability larger than some constant $c_4$,

$$\sup_{t\in B_1^n}\langle t,X^{(n)}\rangle \ge \sqrt{\log(n+1)},$$

and thus it is impossible to get a completely $\psi_1$ version of Theorem 3.1.
Next, observe that Theorem 3.2 is optimal, in the sense that both the $\gamma_2$ term and the term that depends on the $\psi_2$-diameter are required. To see this, fix an integer $k$ and let $1 \le \ell \le k$. Set $F = \{a,-a\} \subset S^{n-1}$, acting as linear functionals on $\mathbb{R}^n$, and let $X = (g_1,\dots,g_n)$ be a Gaussian vector in $\mathbb{R}^n$. With this choice of $X$, the $\psi_2$ and $\ell_2$ metrics on $\mathbb{R}^n$ are equivalent with an absolute constant, and thus $\gamma_2(F,\psi_2) \le c\gamma_2(F,\ell_2) = c_1$. On the other hand, writing $a = (a_1,\dots,a_n)$ and $X_i = (g_{i,1},\dots,g_{i,n})$, for every $1 \le \ell \le k$,

$$\sup_{f\in F}\sup_{I\in E_\ell}\left|\sum_{i\in I}f(X_i)\right| = \sup_{I\in E_\ell}\left|\sum_{i\in I}\sum_{j=1}^na_jg_{i,j}\right|,$$

which is the supremum of the Gaussian process indexed by

$$E_\ell = \{I : I \subset \{1,\dots,k\},\ |I| = \ell\}$$

with the covariance structure endowed by the Hamming metric on $E_\ell$, given by $d_H(I,J) = |I\Delta J|^{1/2}$. Recall the well known entropy estimate for $E_\ell$ with respect to this metric (see, for example, [?]):

Lemma 3.3 For $0 < \lambda < 1/2$ there exists a constant $c_\lambda$ for which the following holds. For every integers $k$ and $1 \le \ell \le k$, there is a subset $P \subset E_\ell$ which satisfies that $\log|P| \ge (1-\lambda)\ell\log(c_\lambda k/\ell)$ and if $I,J \in P$ and $I \ne J$ then $d_H(I,J) \ge \sqrt{\lambda\ell}$. In other words,

$$\log N(E_\ell,\sqrt{\lambda\ell},d_H) \ge (1-\lambda)\ell\log\frac{c_\lambda k}{\ell}.$$


Combining Lemma 3.3 for $\lambda = 1/4$ with Sudakov's minoration (see, e.g. [?]), it is evident that

$$\mathbb{E}\sup_{I\in E_\ell}\left|\sum_{i\in I}\sum_{j=1}^na_jg_{i,j}\right| \ge c\ell\sqrt{\log\frac{ck}{\ell}},$$

proving that the second term in Theorem 3.2 is indeed necessary.

To show that the $\gamma_2$ term is necessary, let $F = \{-1,1\}^n$ acting as linear functionals, and again set $X$ to be the Gaussian vector on $\mathbb{R}^n$. Then, for every $1 \le \ell \le k$,

$$\sup_{a\in\{-1,1\}^n}\sup_{I\in E_\ell}\left|\sum_{i\in I}\sum_{j=1}^na_jg_{i,j}\right| \ge \sup_{a\in\{-1,1\}^n}\left|\sum_{i=1}^\ell\sum_{j=1}^na_jg_{i,j}\right|.$$

The latter is the supremum of the Gaussian process indexed by $\{-1,1\}^n$ with the covariance structure given by the metric $d(u,v) = \sqrt{\ell}\|u - v\|_{\ell_2^n}$.

Thus, it is standard to verify that

$$\mathbb{E}\sup_{a\in\{-1,1\}^n}\left|\sum_{i=1}^\ell\sum_{j=1}^na_jg_{i,j}\right| \ge c\sqrt{\ell}\,n.$$

On the other hand, $\mathrm{diam}(\{-1,1\}^n,\psi_2) \le c\,\mathrm{diam}(\{-1,1\}^n,\ell_2) \le C\sqrt{n}$. Thus, the upper bound from Theorem 3.2 is of the order of $\sqrt{\ell}\,n + \sqrt{n}\,\ell\sqrt{\log(ek/\ell)}$, showing that the $\gamma_2$ term can not be removed from the bound.
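
The lower bound of order $\sqrt{\ell}\,n$ rests on an exact identity: for a fixed array $(g_{i,j})$, the supremum over sign vectors $a \in \{-1,1\}^n$ of $|\sum_i\sum_j a_jg_{i,j}|$ equals $\sum_j|\sum_i g_{i,j}|$, and each column sum of $\ell$ standard Gaussians has standard deviation $\sqrt{\ell}$. The following brute-force sketch checks the identity on a small fixed array; the array is a hypothetical stand-in for a Gaussian sample and the function names are introduced here for illustration.

```python
from itertools import product

def column_sums(g):
    """Sum each column of a rectangular array given as a list of rows."""
    return [sum(row[j] for row in g) for j in range(len(g[0]))]

def sup_over_signs(g):
    """Brute-force sup over a in {-1, 1}^n of |sum_i sum_j a_j * g[i][j]|."""
    sums = column_sums(g)
    return max(abs(sum(a * s for a, s in zip(signs, sums)))
               for signs in product((-1, 1), repeat=len(sums)))
```

The enumeration visits all $2^n$ sign vectors, so it is only meant for very small $n$; the closed form $\sum_j|\sum_i g_{i,j}|$ is what one would use in practice.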
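Proof of Theorem 3.1. To control $\sup_{f\in F}|\sum_{i\in I}f(X_i)|$, consider the following $k$ processes. Recall that for every $1 \le \ell \le k$, $E_\ell = \{I : I \subset \{1,\dots,k\},\ |I| = \ell\}$ and define the random process

$$Z_f^\ell = \sup_{I\in E_\ell}\left|\sum_{i\in I}f(X_i)\right|,$$

where $X_1,\dots,X_k$ are independent random variables distributed according to $\mu$.

Fix $1 \le \ell < k$ (the result for $\ell = k$ requires minor changes and is omitted) and consider the process $Z_f^\ell$. Observe that for every $f,g \in F$, by Lemma 2.1,

$$\Pr\left(|Z_f^\ell - Z_g^\ell| \ge t\right) \le \Pr\left(\sup_{I\in E_\ell}\left|\sum_{i\in I}(f-g)(X_i)\right| \ge t\right) \le |E_\ell|\Pr\left(\left|\sum_{i=1}^\ell(f-g)(X_i)\right| \ge t\right) \le 2|E_\ell|\exp\left(-\frac{ct^2}{\ell\|f-g\|_{\psi_2}^2}\right),$$

where $c$ is an absolute constant.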

Without loss of generality, assume that $\gamma_2(F,\psi_2) < \infty$, let $(F_s)_{s\ge0}$ be an almost optimal admissible sequence for the metric space $(F,\psi_2)$ and set $\pi_s(f)$ to be a nearest element to $f$ in $F_s$ with respect to the $\psi_2$ metric. Thus, $|F_s| \le 2^{2^s}$. Define $s_0$ as the first index such that $2^{s_0-1} \le \log|E_\ell| < 2^{s_0}$, and note that

$$Z_f^\ell = Z_{\pi_{s_0}(f)}^\ell + \sum_{s=s_0}^\infty\left(Z_{\pi_{s+1}(f)}^\ell - Z_{\pi_s(f)}^\ell\right).$$

Fix $u > 0$ to be specified later and $s \ge s_0$, and consider $t_s = u\sqrt{\ell}\,\|\pi_{s+1}(f) - \pi_s(f)\|_{\psi_2}2^{s/2}\sqrt{\log|E_\ell|}$. Then,

$$\Pr\left(\left|Z_{\pi_{s+1}(f)}^\ell - Z_{\pi_s(f)}^\ell\right| \ge t_s\right) \le 2|E_\ell|\exp\left(-cu^22^s\log|E_\ell|\right) \le 2\exp\left(-\log|E_\ell|\,(cu^22^s - 1)\right).$$

Take $u = v_1/\sqrt{\log|E_\ell|}$ for $v_1 \ge c_1$ and note that $2^s \ge \log|E_\ell|$, implying that the tail is upper bounded by $2\exp(-c_2v_1^22^s)$. Summing over $s_0 \le s < \infty$ it is evident that with probability at least

$$1 - \sum_{s=s_0}^\infty2\exp(-c_2v_1^22^s) \ge 1 - 2\exp(-c_32^{s_0}v_1^2) \ge 1 - 2\exp(-c_3v_1^2\log|E_\ell|),$$

for every $f \in F$

$$\sum_{s=s_0}^\infty\left|Z_{\pi_{s+1}(f)}^\ell - Z_{\pi_s(f)}^\ell\right| \le v_1\sqrt{\ell}\sum_{s=s_0}^\infty2^{s/2}\|\pi_{s+1}(f) - \pi_s(f)\|_{\psi_2} \le c_4v_1\sqrt{\ell}\,\gamma_2(F,\psi_2).$$


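To handle $F_{s_0} = \{\pi_{s_0}(f) : f \in F\}$, note that the cardinality of this set is at most $2^{2^{s_0}} \le |E_\ell|^2$. Applying Bernstein's inequality (Lemma 2.2), for every $t > 0$ and every $f \in F$,

$$\Pr\left(Z_f^\ell \ge t\right) \le |E_\ell|\Pr\left(\left|\sum_{i=1}^\ell f(X_i)\right| \ge t\right) \le 2|E_\ell|\exp\left(-c\ell\min\left\{\frac{t^2}{\ell^2\|f\|_{\psi_1}^2},\frac{t}{\ell\|f\|_{\psi_1}}\right\}\right).$$

Let $t = \|f\|_{\psi_1}v_2\log|E_\ell|$ for $v_2 \ge 1$. Since $1 \le \ell < k$ then $\ell^{-1}\log|E_\ell| \ge 1$ and $t/\ell \ge \|f\|_{\psi_1}$. Therefore, with probability at least

$$1 - 2|E_\ell|^3\exp(-c_5v_2\log|E_\ell|) \ge 1 - 2\exp\left(-\log|E_\ell|\,(c_5v_2 - c_6)\right),$$

for every $f \in F_{s_0}$,

$$Z_f^\ell \le v_2\|f\|_{\psi_1}\log|E_\ell| \le v_2\,\mathrm{diam}(F,\psi_1)\log|E_\ell|.$$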


To conclude, there are absolute constants $c_7$, $c_8$ and $c_9$ such that for every $1 \le \ell < k$, if $v_1,v_2 \ge c_7$, with probability at least $1 - 2\exp(-c_8\log|E_\ell|\min\{v_1^2,v_2\})$,

$$\sup_{f\in F}Z_f^\ell \le c_9\left(v_1\sqrt{\ell}\,\gamma_2(F,\psi_2) + v_2\,\mathrm{diam}(F,\psi_1)\log|E_\ell|\right).$$

Since $\log|E_\ell| \le \ell\log(ek/\ell)$, summing the probabilities, the latter holds for every $1 \le \ell \le k$ with probability at least $1 - \exp(-c_{10}\min\{v_1^2,v_2\})$, completing the proof. ■
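
The passage from $\log|E_\ell|$ to $\ell\log(ek/\ell)$ uses the standard estimate $\binom{k}{\ell} \le (ek/\ell)^\ell$. A minimal numerical check of this estimate, with function names that are introduced here purely for illustration, can be sketched as follows.

```python
import math

def log_num_subsets(k, l):
    """log |E_l|, where E_l is the family of l-element subsets of {1, ..., k}."""
    return math.log(math.comb(k, l))

def entropy_bound(k, l):
    """The upper bound l * log(e * k / l) on log |E_l|."""
    return l * math.log(math.e * k / l)
```

For instance, with $k = 10$ and $\ell = 3$ the left-hand side is $\log 120 \approx 4.79$ while the bound is $3\log(10e/3) \approx 6.61$.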

The proof of Theorem 3.2 is similar and is omitted.

Proof of Theorem B. In the $\psi_1$ case, take $v_1 = \sqrt{\log(1/\delta)}$ and $v_2 = \log(1/\delta)$ for $\delta$ small enough. Fix any $f \in F$ and for $I \subset \{1,\dots,k\}$ let $I^+(f) = \{i : f(X_i) \ge 0\} \cap I$ and $I^-(f) = \{i : f(X_i) < 0\} \cap I$. Then, by Theorem 3.1, with probability at least $1 - \delta$,

$$\sum_{i\in I}|f(X_i)| = \left|\sum_{i\in I^+(f)}f(X_i)\right| + \left|\sum_{i\in I^-(f)}f(X_i)\right| \le c(\delta)\left(\sqrt{|I|}\gamma_2(F,\psi_2) + \mathrm{diam}(F,\psi_1)|I|\log\frac{ek}{|I|}\right),$$

as claimed. The $\psi_2$ case is equally easy. ■

From Theorem 3.1 one can derive the following uniform empirical tail estimate for functions in $F$, which was formulated as Theorem A in the introduction.

Corollary 3.4 There exist absolute constants $c_1$, $c_2$ and $c_3$ for which the following holds. Let $F$ be as in Theorem 3.1. For every $v_1,v_2 \ge c_1$, with probability at least $1 - \exp(-c_2\min\{v_1^2,v_2\})$, for any $f \in F$ and $t > 0$,

$$|\{i : |f(X_i)| > t\}| \le \max\left\{\frac{c_3v_1^2\gamma_2^2(F,\psi_2)}{t^2},\ ek\exp\left(-\frac{t}{c_3\alpha v_2}\right)\right\}, \qquad (3.2)$$

where $\alpha = \mathrm{diam}(F,\psi_1)$.

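Proof. Fix $v_1,v_2$ as in Theorem 3.1 and consider the set for which the assertion of Theorem 3.1 holds. Let $(X_1,\dots,X_k)$ be in that set and for $f \in F$ and $t > 0$ put

$$I_t(f) = \{i : |f(X_i)| > t\}.$$

Setting $\alpha = \mathrm{diam}(F,\psi_1)$ there are two possibilities. First, if

$$2\alpha v_2\log\frac{ek}{|I_t(f)|} \le \frac{t}{2},$$

then by Theorem 3.1, applied to the positive and negative parts of $I_t(f)$ as in the proof of Theorem B,

$$t|I_t(f)| \le 2v_1\sqrt{|I_t(f)|}\gamma_2(F,\psi_2) + 2v_2\alpha|I_t(f)|\log\frac{ek}{|I_t(f)|} \le 2v_1\sqrt{|I_t(f)|}\gamma_2(F,\psi_2) + \frac{t}{2}|I_t(f)|.$$

Thus, $t|I_t(f)|/2 \le 2v_1\sqrt{|I_t(f)|}\gamma_2(F,\psi_2)$, implying that

$$|I_t(f)| \le 16v_1^2\frac{\gamma_2^2(F,\psi_2)}{t^2}.$$

Otherwise, $2\alpha v_2|I_t(f)|\log(ek/|I_t(f)|) > t|I_t(f)|/2$, or in other words,

$$|I_t(f)| \le ek\exp\left(-\frac{t}{4v_2\,\mathrm{diam}(F,\psi_1)}\right).$$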
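
The two cases of the proof combine into the explicit bound $\max\{16v_1^2\gamma_2^2/t^2,\ ek\exp(-t/(4v_2\alpha))\}$ on the number of sample points at which $|f|$ exceeds $t$. The sketch below evaluates this bound next to the empirical count. It is only an illustration: the quantities $\gamma_2(F,\psi_2)$ and $\mathrm{diam}(F,\psi_1)$ are passed in as numbers (here called `gamma2` and `alpha`), since computing them is a separate problem, and the function names are hypothetical.

```python
import math

def tail_count(values, t):
    """Number of sample values f(X_i) with |f(X_i)| > t."""
    return sum(1 for x in values if abs(x) > t)

def tail_count_bound(t, k, v1, v2, gamma2, alpha):
    """max{16 v1^2 gamma2^2 / t^2, e k exp(-t / (4 v2 alpha))}.

    gamma2 and alpha are assumed upper bounds on gamma_2(F, psi_2) and
    diam(F, psi_1); the constants 16 and 4 are those of the two-case argument.
    """
    polynomial_term = 16.0 * v1 ** 2 * gamma2 ** 2 / t ** 2
    exponential_term = math.e * k * math.exp(-t / (4.0 * v2 * alpha))
    return max(polynomial_term, exponential_term)
```

The exponential term dominates for small $t$ and the polynomial term for large $t$; once the bound drops below 1 the count must be zero.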


Now we are ready to formulate and prove the main theorem of this section, which is a decomposition result for the class $F$.

Theorem 3.5 There exist absolute constants $c_1$, $c_2$ and $c_3$, and for every $1 \le p < \infty$ there exists a constant $c_4(p)$ for which the following holds. Let $F$ be a class of mean zero functions. For $v \ge c_1$, $A \ge \gamma_2(F,\psi_2)$, $B \ge \mathrm{diam}(F,\psi_1)$ and an integer $k$, set

9 > max 


|c2uFlog + ij ,C2pBlog{c2pB + 1)| 


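Then, there are Lipschitz functions $\phi : \mathbb{R} \to \mathbb{R}$ and $\psi : \mathbb{R} \to \mathbb{R}$ which depend on $\theta$, such that $\|\phi\|_{\mathrm{lip}}, \|\psi\|_{\mathrm{lip}} \le 1$ and, setting $F_1 = \{\phi(f) : f \in F\}$ and $F_2 = \{\psi(f) : f \in F\}$,

1. $F \subset F_1 + F_2$.

2. For every $h \in F_1$, $\|h\|_\infty \le \theta$ and for every $h \in F_2$, $\mathbb{E}|h|^p \le A^2/k$.

3. With probability at least $1 - \exp(-c_3v)$,

$$\sup_{h\in F_2}\frac{1}{k}\sum_{i=1}^k|h(X_i)|^p \le c_2v\frac{A^2}{k}\left(\theta^{p-2} + \kappa_p\right),$$

where $\kappa_p = c_4(p)\theta^{p-2}$ for $p < 2$, $\kappa_2 = c_4(2)\log A$, while for $p > 2$, $\kappa_p = c_4(p)A^{p-2}$.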

Theorem 3.5 implies that $F$ can be decomposed into two simple sets $F_1$ and $F_2$ (which depend on $k$, $p$ and $v$). The fact that these sets are as simple as $F$ is evident because they are images of $F$ via Lipschitz functions with constant 1. In particular, $\gamma_\beta(F_i,d) \le \gamma_\beta(F,d)$ with respect to any reasonable metric $d$. The sets $F_1$ and $F_2$ have additional properties. $F_1$ has a bounded diameter in $L_\infty$ - up to a logarithmic term, its diameter in $L_\infty$ is proportional to the $\psi_1$ diameter of $F$. Thus, if $F$ has a well bounded diameter with respect to the $\psi_1$ metric then functions in $F_1$ are highly concentrated around their means, and one can safely use a contraction argument when bounding the empirical process indexed by a power of $F_1$. The main difficulty is in controlling the "large part" of $F$ - i.e. $F_2$. The empirical process indexed by $F_2$ is small not because of concentration, but because the $\ell_p$ diameter of a random coordinate projection of $F_2$ and its $L_p$ diameter are small.


Proof of Theorem 3.5. Fix an integer $k$ and $v$ for which Corollary 3.4 holds. The first step is to select the Lipschitz functions $\phi$ and $\psi$; those are simply truncation functions at the level $\theta$. For $f \in F$, set $\phi(f) = \mathrm{sgn}(f)\min\{|f|,\theta\}$ and $\psi(f) = f - \phi(f)$. Clearly, both functions have Lipschitz constant 1, $F \subset \phi(F) + \psi(F)$, and for $p \ge 1$, because $\phi(f)$ and $\psi(f)$ are supported on disjoint sets,

$$|f|^p = \min\{|f|^p,\theta^p\} + (|f|^p - \theta^p)1_{\{|f|>\theta\}}.$$
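
The truncation maps and the pointwise identity above are easy to check numerically. The following sketch, in which the function names `phi`, `psi` and `split_power` are chosen here for illustration and act on real numbers rather than on functions, implements the truncation at level `theta` and the two summands of $|x|^p$.

```python
import math

def phi(x, theta):
    """Truncation at level theta: sgn(x) * min(|x|, theta)."""
    return math.copysign(min(abs(x), theta), x)

def psi(x, theta):
    """The remainder x - phi(x), which vanishes whenever |x| <= theta."""
    return x - phi(x, theta)

def split_power(x, theta, p):
    """Return min(|x|^p, theta^p) and (|x|^p - theta^p) * 1{|x| > theta}."""
    small = min(abs(x) ** p, theta ** p)
    large = abs(x) ** p - theta ** p if abs(x) > theta else 0.0
    return small, large
```

For example, with `theta = 2` and `p = 3` the value `x = 3` splits as `27 = 8 + 19`, while any `x` with `|x| <= 2` has a zero second summand.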


Let $A \ge \gamma_2(F,\psi_2)$ and $B \ge \mathrm{diam}(F,\psi_1)$. It is evident that

E|/ri{|/|> 0 } < ci( 2 p.Bf exp ^ (3-3) 


as long as 


9 > {c3pB) log{c5pB) + csBlog 



(3.4) 


which is satisfied by our choice of 9. Thus (2) is established. 

Turning to (3), recall that for every $t > 0$ and $f \in F$, $I_t(f) = \{i : |f(X_i)| > t\}$. By our choice of $v$, with probability at least $1 - \exp(-c_4v)$, for every $f \in F$, for every $t > 0$

$$|I_t(f)| \le \max\left\{\frac{16vA^2}{t^2},\ ek\exp\left(-\frac{t}{4Bv}\right)\right\}. \qquad (3.5)$$

Therefore, if

$$t \ge t_0 = c_6Bv\log(c_6Bv\sqrt{k}/A) \qquad (3.6)$$

for a suitable absolute constant $c_6$, the first term in (3.5) is dominant. Note that if $t > \max\{t_0,4\sqrt{v}A\}$ then $|I_t(f)| = 0$, and since $\theta \ge t_0$ then by a standard integration argument with respect to the random empirical measure $\mu_k$, with probability at least $1 - \exp(-c_4v)$, for every $f \in F$

$$\mathbb{E}_{\mu_k}|f|^p1_{\{|f|>\theta\}} \le \theta^p\Pr_{\mu_k}(|f| > \theta) + \int_\theta^\infty pt^{p-1}\Pr_{\mu_k}(|f| > t)\,dt \le c_7\frac{vA^2}{k}\left(\theta^{p-2} + \int_\theta^{\max\{\theta,4\sqrt{v}A\}}pt^{p-3}\,dt\right),$$

from which the claim follows. ■


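Remark 3.6 Note that Theorem 3.5 enables one to bound $\|\mu_k - \mu\|_{F_1^p}$ and thus $\|\mu_k - \mu\|_{F^p}$. Indeed, pointwise, $\left||\phi(f)|^p - |\phi(g)|^p\right| \le p\theta^{p-1}|f - g|$, implying that

$$\gamma_2((\phi(F))^p,\psi_2) \le cp\theta^{p-1}\gamma_2(F,\psi_2).$$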

By a standard generic chaining argument (see Theorem, \ 2., ‘A and l^), for 
every v > 0, with probability at least 1 — exp(— 


sup 

/ 6 F 


k 


^ 72(((/)(F))P,V'2) 

- Tk - 


<csp6^ 


'y2{F,ip2) 

\/k 


Combining this with Theorem 13.51 it follows that with probability at 
least 1 — 2exp(—ciu), 


Whk - h\\F < C2V p6P 


o-ll2{F,ll)2) , 7|(F,V’2) 


^/k 


+ 


k 




ip-2 


+ K, 


+ l)^ . 


Remark 3.7 Observe that by (3.3), $\sup_{h \in \psi(F)}\mathbb{E}|h|^p \le (cpB)^p\exp(\ldots)$, a
fact we shall use below.


We end this section with another observation which follows easily from
the proof of Theorem 3.5. To avoid complications, we will formulate it only
in the case we need it, which is when $F$ is a class of linear functionals
on $\mathbb{R}^n$ and $\mu$ is a measure on $\mathbb{R}^n$. Consider the random variable $U =
\sup_{f \in F}|\langle f,X\rangle|$, and for every integer $k$ set $H_k = \mathbb{E}\max_{1 \le j \le k} U_j$, where
$U_1,\ldots,U_k$ are independent copies of $U$.
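To get a feeling for the size of $H_k$, here is a small Monte Carlo sketch in Python. Everything specific in it is an assumption made for illustration: the class is taken to be the set of functionals given by the Euclidean unit sphere, so that $U = \|X\|$, and $X$ has i.i.d. variance-one Laplace coordinates, one convenient example of an isotropic log-concave vector. The function names are hypothetical.

```python
import math
import random

def laplace_unit_variance(rng):
    """Mean-zero Laplace variable with variance 1."""
    rate = math.sqrt(2.0)
    # The difference of two independent Exp(rate) variables is Laplace with variance 2 / rate^2.
    return rng.expovariate(rate) - rng.expovariate(rate)

def sample_norm(n, rng):
    """Euclidean norm of one random point in R^n with i.i.d. Laplace coordinates."""
    return math.sqrt(sum(laplace_unit_variance(rng) ** 2 for _ in range(n)))

def estimate_H(k, n, trials, rng):
    """Monte Carlo estimate of H_k = E max_{1 <= j <= k} U_j with U = ||X||."""
    total = 0.0
    for _ in range(trials):
        total += max(sample_norm(n, rng) for _ in range(k))
    return total / trials

rng = random.Random(1)
n = 16
h1 = estimate_H(1, n, 400, rng)    # estimate of E||X||
h50 = estimate_H(50, n, 100, rng)  # estimate of E max of 50 independent copies
print(h1, h50, math.sqrt(n))
```

In this toy setting $H_1 = \mathbb{E}\|X\| \le \sqrt{n}$ by Jensen's inequality, and the maximum over $k$ copies grows slowly with $k$, which is the quantity that sets the truncation level used in the next theorem.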

Theorem 3.8 For every $p > 1$ and $0 < \delta, \varepsilon < 1$ there are constants
$c_1(\delta,\varepsilon,p)$, $c_2(\delta,p)$ and $c_3(p)$ for which the following holds. Let $F$ and $\mu$
be as above, consider the random variable $Y = X 1_{\{U \le c_1(\delta,\varepsilon,p)H_k\}}$ and let $\nu$
be the probability measure on $\mathbb{R}^n$ corresponding to $Y$. If $A \ge \gamma_2(F,\psi_2(\nu))$
and $B \ge \mathrm{diam}(F,\psi_1(\mu))$, then with probability at least $1 - \delta$,

$$\|\mu_k - \mu\|_{F^p} \le c_2\bigl(\ldots(\theta^{p-\ldots} + K_p)\bigr) + \ldots \tag{3.7}$$

where

$$\theta = \max\{c_2 B\log(\ldots),\ c_2 pB\log(c_2 pB + 1)\},$$

and $K_p = 1$ for $1 < p < 2$, $K_2 = \log(\ldots)$ and $K_p = H_k^{\ldots}$ for $p > 2$.


Because the proof is based on the same arguments used in Theorem 3.5 and
Remark 3.6, we will only give a brief sketch of the required modifications,
which are that with high probability, $X_i = Y_i$ for $1 \le i \le k$ and that by the
Cauchy-Schwarz inequality,

$$\sup_{f \in F}\bigl|\mathbb{E}_\mu|f|^p - \mathbb{E}_\nu|f|^p\bigr| = \sup_{f \in F}\mathbb{E}|f|^p 1_{\{U > c_1(\delta,\varepsilon,p)H_k\}} \le c_3(p)B^p\varepsilon,$$

for the right choice of constants. Thus, one can replace the measure $\mu$ with
the measure $\nu$ and consider the empirical process $\|\nu_k - \nu\|_{F^p}$ instead of
$\|\mu_k - \mu\|_{F^p}$.

The advantage of using the measure $\nu$ is that it is a truncated version of
$\mu$ at the "correct" level for $F$ and the sample size $k$. This truncation enables
us to bound $\gamma_2(S^{n-1}, \psi_2(\nu))$, where $\nu$ is a truncation of an isotropic, log-concave
measure on $\mathbb{R}^n$.


4 Applications 

The first geometric application we present deals with sections of small diameter
of a convex, symmetric body. Let $\Gamma = \sum_{i=1}^k \langle X_i, \cdot\rangle e_i$, where $X_1,\ldots,X_k$
are selected according to an isotropic log-concave measure. As we show below,
if $K$ is a convex, symmetric body in $\mathbb{R}^n$, then with high probability, the diameter
of $K \cap \ker(\Gamma)$ is small. This extends a celebrated result of Pajor and
Tomczak-Jaegermann [...], which was proved in the case where the random
subspace was selected according to the Haar measure on the Grassmann
manifold $G(n,k)$, but the same proof works in the Gaussian case. Various
versions and extensions of this result may be found, for example, in [...].

The following theorem is a formulation of a version of this result for a
general $\psi_2$ operator (see [...]). Let us introduce the following notation: for
a set $T \subset \mathbb{R}^n$ we denote by $\ell_*(T) = \mathbb{E}\sup_{t \in T}\bigl|\sum_{i=1}^n g_i t_i\bigr|$ the expectation
of the supremum of the gaussian process indexed by $T$. Recall that by the
majorizing measures Theorem [24], there are absolute constants $c_1$ and $c_2$
such that for every $T \subset \mathbb{R}^n$,

$$c_1\gamma_2(T, \|\ \|) \le \ell_*(T) \le c_2\gamma_2(T, \|\ \|). \tag{4.1}$$
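For a finite index set the expectation of the supremum of the gaussian process can be estimated directly by simulation. The Python sketch below is an illustration under stated assumptions: it takes the functional to be $\mathbb{E}\sup_{t \in T}|\sum_i g_i t_i|$ with independent standard gaussians $g_i$ (the absolute value is an assumption about the exact form of the definition), and it uses the standard basis of $\mathbb{R}^n$ as the set $T$. The function name is hypothetical.

```python
import math
import random

def gaussian_width(T, trials, rng):
    """Monte Carlo estimate of E sup_{t in T} |sum_i g_i t_i| for a finite set T."""
    n = len(T[0])
    total = 0.0
    for _ in range(trials):
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += max(abs(sum(gi * ti for gi, ti in zip(g, t))) for t in T)
    return total / trials

n = 30
# T = {e_1, ..., e_n}: the supremum is max_i |g_i|.
basis = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
rng = random.Random(2)
width = gaussian_width(basis, 500, rng)
print(width, math.sqrt(2 * math.log(2 * n)))
```

For this choice of $T$ the supremum is the maximum of $2n$ standard gaussians, so its expectation is at most $\sqrt{2\log(2n)}$; the printed pair compares the estimate with that bound.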





Theorem 4.1 [...] There exist absolute constants $c$ and $c_1$ for which the
following holds. Let $X_1,\ldots,X_k$ be distributed according to an isotropic measure
$\mu$ on $\mathbb{R}^n$ and assume that for every $t \in \mathbb{R}^n$, $\|\langle t,\cdot\rangle\|_{\psi_2} \le \alpha\|t\|$ for some
$\alpha \ge 1$. If $K \subset \mathbb{R}^n$ is a convex symmetric body then with probability at least
$1 - \exp(-c_1 k/\alpha^4)$,

$$\mathrm{diam}(\ker\Gamma \cap K) \le r_k^*(K),$$

where

$$r_k^*(K) = \inf\{\rho > 0 : \rho \ge c\alpha^2\ell_*(K \cap \rho S^{n-1})/\sqrt{k}\}.$$

Our result is similar (though with a weaker estimate) to Theorem 4.1.
Other than the different ways of estimating the empirical process $\|\mu_k - \mu\|_{F^2}$,
the two proofs are identical, and thus the proof of Theorem 4.2 is omitted.

Theorem 4.2 For every $0 < \delta < 1$ there exists a constant $c(\delta)$ for which
the following holds. Let $\mu$ be an isotropic, log-concave measure on $\mathbb{R}^n$ and
let $K \subset \mathbb{R}^n$ be a convex symmetric body. If $X_1,\ldots,X_k$ are independent,
distributed according to $\mu$, then with probability at least $1 - \delta$,

$$\mathrm{diam}(\ker\Gamma \cap K) \le q_k^*(K),$$

where

$$q_k^*(K) = \inf\{\rho > 0 : \rho \ge c(\delta)\gamma_2(K \cap \rho S^{n-1}, \psi_2)\ldots\}.$$

If $\mu$ is a subgaussian measure then for every $A \subset \mathbb{R}^n$, $\gamma_2(A,\psi_2) \le c\ell_*(A)$,
and thus $\gamma_2(K \cap \rho S^{n-1},\psi_2) \le c\ell_*(K \cap \rho S^{n-1})$. Therefore, Theorem 4.2 recovers
Theorem 4.1 up to a $\sqrt{\log n}$ factor. Of course, the bound on the probability
is considerably weaker. On the other hand, Theorem 4.2 holds for a much
wider family of measures because the bound given in Theorem 4.1 depends
on the equivalence constant between the $\psi_2$ and $\ell_2$ metrics endowed on $\mathbb{R}^n$.


4.1 Sampling from an isotropic, log-concave measure 


A question which was originally studied in [...] is the following:
how many points sampled from an isotropic, convex, symmetric body are
needed to ensure that the random operator $k^{-1/2}\sum_{i=1}^k \langle X_i,\cdot\rangle e_i$ is an almost
isometric embedding of $\ell_2^n$ in $\ell_2^k$? In other words, that with probability at
least $1 - \delta$, for every $\theta \in S^{n-1}$,

$$1 - \varepsilon \le \frac{1}{k}\sum_{i=1}^k \langle X_i,\theta\rangle^2 \le 1 + \varepsilon.$$

Theorem 4.3 [...] For every $0 < \varepsilon, \delta < 1$ there is a constant $c(\varepsilon,\delta)$ for
which the following holds. Let $X_1,\ldots,X_k$ be independent random variables,
distributed according to the volume measure of a convex, symmetric body in
isotropic position. If $k \ge c(\varepsilon,\delta)n\log^2 n$, then with probability at least $1 - \delta$,
for every $\theta \in S^{n-1}$,

$$1 - \varepsilon \le \frac{1}{k}\sum_{i=1}^k \langle\theta, X_i\rangle^2 \le 1 + \varepsilon.$$

The estimate of $k \sim n\log^2 n$ was first proved by Rudelson [...]. Previously,
Bourgain showed [...] how to obtain this result with a slightly weaker estimate
of $k \sim n\log^3 n$, but then Giannopoulos and Milman demonstrated that
Bourgain’s method can actually give the same estimate as Rudelson’s.
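The almost isometric condition can be explored numerically, since the supremum over the sphere of $|\frac{1}{k}\sum_i \langle X_i,\theta\rangle^2 - 1|$ is the operator norm of the symmetric matrix $\frac{1}{k}\sum_i X_i \otimes X_i - \mathrm{Id}$. The Python sketch below is a toy experiment under stated assumptions: the sample comes from a product measure with variance-one Laplace coordinates (an isotropic log-concave measure, used here as a stand-in for the volume measure of a convex body), the operator norm is approximated by power iteration, and all function names are hypothetical.

```python
import math
import random

def laplace_point(n, rng):
    """Point in R^n with i.i.d. mean-zero, variance-one Laplace coordinates."""
    rate = math.sqrt(2.0)  # difference of two Exp(rate) variables has variance 2 / rate^2 = 1
    return [rng.expovariate(rate) - rng.expovariate(rate) for _ in range(n)]

def covariance_deviation(points):
    """Matrix (1/k) sum_i x_i x_i^T - I for a list of k points in R^n."""
    k, n = len(points), len(points[0])
    M = [[-1.0 if a == b else 0.0 for b in range(n)] for a in range(n)]
    for x in points:
        for a in range(n):
            xa = x[a] / k
            for b in range(n):
                M[a][b] += xa * x[b]
    return M

def sup_deviation(M, iters=300):
    """Power-iteration estimate of sup over unit theta of |theta^T M theta| (M symmetric)."""
    n = len(M)
    v = [1.0 / math.sqrt(n)] * n
    value = 0.0
    for _ in range(iters):
        w = [sum(M[a][b] * v[b] for b in range(n)) for a in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            return 0.0
        value = abs(sum(v[a] * w[a] for a in range(n)))  # Rayleigh quotient at the unit vector v
        v = [c / norm for c in w]
    return value

rng = random.Random(3)
n = 8
deviation = {}
for k in (20, 2000):
    sample = [laplace_point(n, rng) for _ in range(k)]
    deviation[k] = sup_deviation(covariance_deviation(sample))
print(deviation)
```

The Rayleigh quotient returned by `sup_deviation` never exceeds the true supremum, so the printed values are estimates from below; they only illustrate how the deviation shrinks as the sample size grows relative to the dimension.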

The proofs of Bourgain and Rudelson use very different arguments.
Rudelson’s proof is based on a noncommutative Khintchine inequality, due
to Lust-Piquard and Pisier [...], namely, a bound on Rademacher averages
of the form $\mathbb{E}\|\sum_{i=1}^k \varepsilon_i x_i \otimes x_i\|_{\ldots}$ for $p > 1$, where $x_i \in \ell_2$ and $(\varepsilon_i)_{i=1}^k$
are independent, symmetric, $\{-1,1\}$-valued random variables. The fact that
the set indexing the empirical process is exactly the Euclidean sphere is essential
in the proof and the argument cannot be modified to handle any
other indexing sets - not even other subsets of the sphere.

Bourgain’s proof uses a similar technique to the one we used here, which
relies on the following version of Theorem 3.1. The formulation we present
here is from [1].

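Lemma 4.4 Let $\delta \in (0,1)$ and let $X_1,\ldots,X_k$ be points in $\mathbb{R}^n$ sampled according
to an isotropic log-concave measure. If $k \le c\delta\exp(\sqrt{n})$ then with
probability at least $1 - \delta$, for every $I \subset \{1,\ldots,k\}$,

$$\Bigl\|\sum_{i \in I} X_i\Bigr\| \le c_1(\delta)\bigl(\sqrt{\log k}\sqrt{|I|}\sqrt{n} + |I|\log k\bigr).$$

In particular, with probability at least $1 - \delta$, for every $t \ge c(\delta)\log k$ and
every $x \in S^{n-1}$,

$$|\{i : \langle x,X_i\rangle \ge t\}| \le \ldots \tag{4.2}$$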

Bourgain’s method was generalized in [...], in which the following theorem
was established:

Theorem 4.5 Let $p > 0$ and $0 < \delta < 1$. There exists $n_0(\delta)$ such that for
every $n \ge n_0(\delta)$, every log-concave measure on $\mathbb{R}^n$, every $k \ge k_0(\delta,p)$ and
every $\theta \in S^{n-1}$,

$$\ldots\sum_{i=1}^k \ldots$$

where $c_p$ and $C_p$ depend only on $p$ and

$$k_0(\delta,p) = c(\delta,p)\begin{cases} n & \text{if } 0 < p \le 1, \\ n\log^{\ldots} n & \text{if } 1 < p \le 2, \\ \min\{(p-2)^{-1}, \log n\}(n\log\ldots) & \text{if } p > 2. \end{cases}$$

Note that this bound is isomorphic in nature rather than almost isometric,
though in the case $p = 2$ the proof of Theorem 4.5 can be modified to give
an almost isometric estimate.

Recently, Guédon and Rudelson [...] were able to bound

$$\mathbb{E}_\varepsilon\sup_{y \in K}\Bigl|\sum_{i=1}^k \varepsilon_i|\langle x_i,y\rangle|^p\Bigr|,$$

for any $x_1,\ldots,x_k \in \mathbb{R}^n$, where $K \subset \mathbb{R}^n$ is a convex, symmetric body which
has a $q$-power type modulus of convexity. The method of proof is based
on majorizing measures, and can be used to bound $\mathbb{E}\|\mu_k - \mu\|_{F^p}$ for $F =
\{\langle x,\cdot\rangle : x \in K\}$ as long as $p \ge q \ge 2$. It turns out that the dominant factor
in the bound is

$$\bigl(\mathbb{E}\max_{j}\|X_j\|^{\ldots}\cdot\mathbb{E}\max\ldots\bigr)\ldots$$

For $K = B_2^n$ this approach yields the best known estimates for $\mathbb{E}\|\mu_k - \mu\|_{F^p}$
for $p \ge 2$, and the resulting estimate on the required size of the sample is
$k \sim c(\varepsilon,\delta,p)n^{p/2}\log n$, and in particular, for $p = 2$ gives the best known
estimate of $k \sim c(\varepsilon,\delta)n\log n$. Let us mention that for $p = 2$ this result is
not helpful for "small" subsets of the sphere, and the best bound that one
can establish for such subsets coincides with the one obtained for the whole
sphere.

All the known bounds, including [...] and ours, are based on the behavior
of the random variable $\|X\|$. The best estimates on $\|X\|$ are due to Paouris
[...]:

Theorem 4.6 There are absolute constants $c_1$ and $c_2$ for which the following
holds. Let $X$ be distributed according to an isotropic log-concave measure
on $\mathbb{R}^n$. Then, for every $p \le c_1\sqrt{n}$, $(\mathbb{E}\|X\|^p)^{1/p} \le c_2\sqrt{n}$.

Theorem 4.6 immediately leads to a removal of a logarithmic factor in
[...], though not to an improved level of truncation; thus, the estimate of
Theorem [...] in the case $p = 2$ remains unchanged despite the improved tail
estimate.

The properties of an isotropic log-concave measure which will be used
below are that for suitable absolute constants $c$ and $c_1$,

1. linear functionals have a subexponential tail - that is, for every $x \in \mathbb{R}^n$,

$$\|\langle X,x\rangle\|_{\psi_1} \le c\|x\|,$$

and

2. By Theorem 4.6, for $n \le k \le \exp(c\sqrt{n})$,

$$\mathbb{E}\max_{1 \le i \le k}\|X_i\| = \mathbb{E}\max_{1 \le i \le k}\sup_{\theta \in S^{n-1}}\langle X_i,\theta\rangle \le c_1\sqrt{n}.$$

Therefore, in light of Theorem 3.8 it is enough to consider the truncated
measure $\nu$ on $\mathbb{R}^n$ which is supported on a ball of radius $c_2(\delta)\sqrt{n}$ to bound
$\|\mu_k - \mu\|_{F^p}$. The main ingredient in our method is to bound $\gamma_2(S^{n-1},\psi_2(\nu))$ and
to that end we shall estimate

$$I_E = \mathbb{E}\Bigl\|\sum_{i=1}^n g_i e_i\Bigr\|_E,$$

where $g_1,\ldots,g_n$ are standard, independent Gaussian variables. The particular
norm $\|\ \|_E$ we consider is the one endowed on $\mathbb{R}^n$ by the $\psi_2(\nu)$ structure,
formally defined for every $t \in \mathbb{R}^n$ by $\|t\|_E = \|\langle t,\cdot\rangle\|_{\psi_2(\nu)}$.

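Lemma 4.7 There exists an absolute constant $c$ for which the following
holds. Let $\nu$ be a probability measure on $\mathbb{R}^n$ and set $Y$ to be distributed
according to $\nu$. If $Z = \|Y\|$ and $E = (\mathbb{R}^n, \|\ \|_{\psi_2(\nu)})$, then $I_E \le c\|Z\|_\infty$.


Proof. Fix $\rho$ to be named later and consider the gaussian vector $G =
(g_1,\ldots,g_n)$, where $(g_i)_{i=1}^n$ are independent, standard gaussian random variables.
Let $\|Z\|_\infty = D$ and since $\|f\|_{\psi_2} \le \rho\,\mathbb{E}\exp(f^2/\rho^2)$ then

$$I_E \le \rho\,\mathbb{E}_Y\mathbb{E}_G\exp\Bigl(\frac{(\sum_{i=1}^n g_i\langle Y,e_i\rangle)^2}{\rho^2}\Bigr).$$

Recall that $\sum_{i=1}^n g_i\langle Y,e_i\rangle$ is distributed as $g\|Y\|$, where $g$ is a standard gaussian variable, and thus,

$$\mathbb{E}_Y\mathbb{E}_G\exp\Bigl(\frac{(\sum_{i=1}^n g_i\langle Y,e_i\rangle)^2}{\rho^2}\Bigr) \le \mathbb{E}_Y\mathbb{E}_G\Bigl(1 + \sum_{m=1}^\infty \frac{(\sum_{i=1}^n g_i\langle Y,e_i\rangle)^{2m}}{m!\rho^{2m}}\Bigr) \le 1 + \sum_{m=1}^\infty \frac{\mathbb{E}g^{2m}D^{2m}}{m!\rho^{2m}} = \mathbb{E}\exp\Bigl(\frac{D^2g^2}{\rho^2}\Bigr) \le 2,$$

if one selects $\rho = cD$. Therefore, $I_E \le cD$, as claimed.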
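The final step of the proof rests on choosing $\rho = cD$ so that a gaussian exponential moment is at most 2. Assuming the quantity in question is $\mathbb{E}\exp(D^2g^2/\rho^2)$ for a standard gaussian $g$ (our reading of the display), the classical identity $\mathbb{E}\exp(sg^2) = (1-2s)^{-1/2}$ for $s < 1/2$ makes the admissible constants explicit. The Python sketch below, with hypothetical function names, evaluates that identity and cross-checks it by simulation at a value of $s$ where the simulation is stable.

```python
import math
import random

def gaussian_square_mgf(s):
    """E exp(s g^2) for a standard Gaussian g, valid for s < 1/2."""
    if s >= 0.5:
        raise ValueError("the expectation is infinite for s >= 1/2")
    return 1.0 / math.sqrt(1.0 - 2.0 * s)

def smallest_ratio():
    """Smallest c with E exp(g^2 / c^2) <= 2, i.e. 1 - 2 / c^2 >= 1/4."""
    return math.sqrt(8.0 / 3.0)

c = smallest_ratio()
value_at_c = gaussian_square_mgf(1.0 / c ** 2)

# Monte Carlo cross-check of the closed form at s = 0.1 (finite variance there).
rng = random.Random(4)
s = 0.1
mc = sum(math.exp(s * rng.gauss(0.0, 1.0) ** 2) for _ in range(100000)) / 100000
print(c, value_at_c, mc, gaussian_square_mgf(s))
```

Under this reading any $\rho \ge \sqrt{8/3}\,D$ keeps the exponential moment below 2, which is one way to see that the constant in $\rho = cD$ can be taken absolute.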


Definition 4.8 For two sets $A, B \subset \mathbb{R}^n$, let $N(A,B)$ be the minimal number
of translates of $B$ needed to cover $A$, that is, the minimal cardinality of a
set $\{x_1,\ldots,x_m\}$ such that $A \subset \bigcup_{i=1}^m (B + x_i)$.

Note that if $B$ is a unit ball of a norm on $\mathbb{R}^n$ then $N(A,\varepsilon B)$ are the covering
numbers of $A$ with respect to the metric endowed by $B$.
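For a finite set $A$ and the Euclidean ball, an upper bound on the covering number can be produced by a greedy construction. The Python sketch below is an illustration only: the function name, the choice of points on the unit circle and the radius are assumptions made here. It picks centers from $A$ one at a time, so the resulting family of translates covers $A$ and its size bounds $N(A, \varepsilon B_2^2)$ from above.

```python
import math
import random

def greedy_cover(points, eps):
    """Greedily pick centers from `points` so every point is within eps of some center."""
    centers = []
    for x in points:
        # x becomes a new center only if no existing center already covers it
        if all(math.dist(x, c) > eps for c in centers):
            centers.append(x)
    return centers

random.seed(5)
angles = [random.uniform(0.0, 2.0 * math.pi) for _ in range(500)]
A = [(math.cos(a), math.sin(a)) for a in angles]  # finite subset of the unit circle
eps = 0.5
centers = greedy_cover(A, eps)
print(len(centers))
```

The chosen centers are pairwise more than $\varepsilon$ apart, so the same construction also yields an $\varepsilon$-separated subset of $A$; this link between covering and packing is what volumetric estimates exploit.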

Corollary 4.9 There exists an absolute constant $c$ such that for every $\varepsilon \ge 1/2$,

$$\log N(B_2^n, \varepsilon B_E) \le c\frac{n}{\varepsilon^2},$$

and for $0 < \varepsilon < 1/2$,

$$\log N(B_2^n, \varepsilon B_E) \le cn\log(\ldots),$$

where $B_E$ is the unit ball of $(\mathbb{R}^n, \psi_2(\nu))$ and $B_2^n$ is the Euclidean unit ball.

Proof. Recall that if $Z = \|Y\|$ then $\|Z\|_\infty \le c_1\sqrt{n}$ for a suitable absolute
constant. By the dual Sudakov Theorem, $\log N(B_2^n, \varepsilon B_E) \le c_2 I_E^2/\varepsilon^2$,
and applying Lemma 4.7, $I_E \le c_3\sqrt{n}$, from which the first part of the claim
follows. Turning to the second part, by a standard volumetric estimate (see,
e.g. [...]), and since $B_E$ is a unit ball of a norm on $\mathbb{R}^n$, $N(\tfrac{1}{2}B_E, \varepsilon B_E) \le
(\ldots)^n$. Therefore,

$$N(B_2^n, \varepsilon B_E) \le N(B_2^n, \tfrac{1}{2}B_E)\cdot N(\tfrac{1}{2}B_E, \varepsilon B_E) \le \exp(c_4 n)\cdot(\ldots)^n.$$

Using Corollary 4.9 one can bound $\gamma_2(S^{n-1},\psi_2(\nu))$ by applying Theorem [...]
for the space $\ell_2$ which is 2-convex, and with $d$ being the $\psi_2(\nu)$ metric
endowed on $\mathbb{R}^n$.


Corollary 4.10 Let $\mu$ be an isotropic log-concave measure on $\mathbb{R}^n$ and set $\nu$
to be its truncation as above. Then for $n \le k \le \exp(c_1\sqrt{n})$,

$$\gamma_2(S^{n-1},\psi_2(\nu)) \le c_2\sqrt{n\log n},$$

where $c_1$ and $c_2$ are absolute constants.

Proof. The proof is immediate from Theorem [...], the entropy estimate in
Corollary 4.9, combined with the fact that for every $\theta \in S^{n-1}$, $|\langle Y,\theta\rangle| \le
\|Y\| \le c\sqrt{n}$, and thus $\mathrm{diam}(S^{n-1},\psi_2(\nu)) \le c\sqrt{n\log k} \le c\sqrt{n\log n}$. ■

Let us remark that we believe this estimate is suboptimal by a factor of
$\sqrt{\log n}$.

Combining Corollary 4.10 with Theorem 3.8 we obtain the following
(most likely, suboptimal) estimate of $\|\mu_k - \mu\|_{F^p}$, which we only state for
$p > 2$. This estimate recovers the best known result for $p > 2$, and was
originally established in [5].


Theorem 4.11 For every $\varepsilon > 0$, $0 < \delta < 1$ and $p > 2$ there exists a
constant $c(\varepsilon,\delta,p)$ for which the following holds. With probability at least
$1 - \delta$, if $k \ge k_0$,

$$\sup_{\theta \in S^{n-1}}\Bigl|\frac{1}{k}\sum_{i=1}^k |\langle X_i,\theta\rangle|^p - \mathbb{E}|\langle X,\theta\rangle|^p\Bigr| \le \varepsilon,$$

provided that $k_0 \ge c(\varepsilon,\delta,p)n^{p/2}\log n$ for $p > 2$.


Proof. Let $n \le k \le \exp(c_1\sqrt{n})$. Using the notation of Theorem 3.8, observe
that $\gamma_2(S^{n-1},\psi_2(\nu)) \le c_2\sqrt{n\log n}$, $\mathrm{diam}(S^{n-1},\psi_1) \le c_2$, $H_k \le c_2\sqrt{n}$. Also,
if $k \ge c_3 n\log n$, then for $p > 2$, $\theta$ can be taken as $\theta \le c_4\log\log n$, from
which the claim is evident. ■

Let us remark that if one could select $k \le cn\log n$ it would be possible
to take $\theta$ at the level of an absolute constant. This would be the case if the
logarithmic term in the estimate on $\gamma_2(S^{n-1},\psi_2(\nu))$ were to be removed and
would lead to the optimal bound for any $p > 1$.